This notebook introduces SMCP3 by walking through the development of an SMCP3 algorithm for the inference problem used as the motivating example in this talk on SMCP3. You may find it useful to watch the beginning of the talk for a quick introduction to the problem setting, and to see the visualizations we will produce in this notebook.
Sometimes when running this notebook, I found that Jupyter gave error messages related to displaying the animations. I was able to solve this problem by restarting Jupyter with the following flag:
jupyter notebook --NotebookApp.iopub_msg_rate_limit=10e8
Now, let’s dive right in!
using Pkg; Pkg.activate("../..")
# Include this during development so that changes in the static code files
# are loaded immediately by the Julia instance.
using Revise

We consider a probabilistic model in which an object performs a random walk in \(\mathbb{R}^D\). That is, we set \(z_0 = \vec{0}\), and for every \(t > 0\), we have \(z_t \sim \mathcal{N}(z_{t-1}, \sigma_z^2 I_D)\), where \(I_D\) is the \(D\)-dimensional identity matrix.
Under this probabilistic model, we assume that at every timestep we receive a noisy observation of the object’s position, \(y_t\). Specifically, \(y_t \sim \mathcal{N}(z_t, \sigma_y^2 I_D)\).
The goal of probabilistic inference in this model is: given a stream of noisy observations \(y_{1:T} = (y_1, y_2, \dots, y_T)\), infer the latent trajectory along which the underlying object may have moved to produce those observations. Specifically, we will characterize the posterior distribution \(P(z_{1:T} = \cdot \mid y_{1:T})\).
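For reference, the posterior factorizes sequentially over timesteps, which is what makes filtering-style inference natural in this model:

\[
P(z_{1:T} \mid y_{1:T}) \;\propto\; \prod_{t=1}^{T} \mathcal{N}(z_t;\, z_{t-1},\, \sigma_z^2 I_D)\, \mathcal{N}(y_t;\, z_t,\, \sigma_y^2 I_D),
\]

where \(z_0 = \vec{0}\).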
(In this particular model, inference can be done exactly using a Kalman filter. In this notebook, we will instead develop an approximate inference algorithm using SMCP3, which can be used to perform inference in this model and in a wide range of other models, whether or not their likelihood functions are differentiable. While SMCP3 is not needed for this particular toy random-walk model, since Kalman filtering already suffices, we use this model in our tutorial because it is very simple.)
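To make the "exact inference" baseline concrete, here is a minimal sketch of a Kalman filter specialized to this model (random-walk dynamics and identity observation matrix). This helper, `kalman_filter`, is our own illustration and is not part of Gen or of the SMCP3 algorithm developed below:

```julia
using LinearAlgebra

# Exact posterior filtering for the linear-Gaussian random-walk model:
#   zₜ ~ N(zₜ₋₁, σz² I),  yₜ ~ N(zₜ, σy² I),  z₀ = 0⃗ (known exactly).
# Returns the filtered posterior means E[zₜ | y₁:ₜ] for t = 1, …, T.
function kalman_filter(ys, D; σz=1.0, σy=1.0)
    μ, P = zeros(D), zeros(D, D)     # z₀ is known, so zero initial covariance
    Q, R = σz^2 * I(D), σy^2 * I(D)
    means = Vector{Vector{Float64}}()
    for y in ys
        μp, Pp = μ, P + Q            # predict: random-walk dynamics
        K = Pp / (Pp + R)            # Kalman gain (observation matrix = I)
        μ = μp + K * (y - μp)        # posterior mean update
        P = (I(D) - K) * Pp          # posterior covariance update
        push!(means, μ)
    end
    return means
end
```

With `σz = σy = 1`, the first filtered mean is exactly halfway between the prior mean \(\vec{0}\) and the first observation, matching the standard Kalman gain of \(1/2\).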
Below, we define this probabilistic model as a Gen probabilistic program. First, we define global variables for \(\sigma_z\) and \(\sigma_y\).
# Global variables σ_z and σ_y controlling the random walk speed and the observation noise
Z_STD() = 1.
Y_STD() = 1.
Then, we define a Gen probabilistic program specifying what new values become instantiated by the model at each timestep.
using Gen
@gen (static, diffs) function step_model(t, zₜ₋₁)::Float64
# We use Gen's `broadcasted_normal` distribution.
# If μ is a D-dimensional vector, then `broadcasted_normal(μ, σ)`
# is the Gaussian distribution with mean μ and covariance matrix σ²*I,
# where I is the D-dimensional identity matrix.
# The latent state evolves according to a random walk.
zₜ ~ broadcasted_normal(zₜ₋₁, Z_STD())
# A noisy measurement of the latent state is observed.
yₜ ~ broadcasted_normal(zₜ, Y_STD())
# Output the new latent position, to be fed in at the next
# timestep as the value zₜ₋₁.
return zₜ
end
Then, we define a Gen probabilistic program which simulates from this step model for T timesteps. To do this, we use Gen’s Unfold combinator.
@gen (static, diffs) function model(D, T)
z₀ = zeros(D) # D-dimensional vector of zeros
steps ~ Unfold(step_model)(T, z₀)
return steps
end
For performance, we wrote the above probabilistic programs using Gen’s Static probabilistic program writing DSL, which can compile the probabilistic programs to speed up probabilistic inference. We must call the following macro so Gen knows these probabilistic programs are ready to be compiled when they are needed.
@load_generated_functions()

Now, let’s generate a random latent trace from this probabilistic model. We’ll set \(D=2\), so we can visualize the object as moving around in the plane.
## Generate a random trace from the model
trace_1 = Gen.simulate(
model,
( # Arguments to pass into the probabilistic model
2, # D = 2 (2-dimensional simulation)
4 # T = 4 (simulate 4 timesteps)
)
);

This simulation generated a “trace” from the probabilistic model, containing all the \(z_t\) and \(y_t\) values from one simulation of the random walk model. We can have Gen print out the collection of all the random choices stored in this trace:
get_choices(trace_1)
│
└── :steps
│
├── 1
│ │
│ ├── :zₜ : [0.42602834835642395, -1.6296511028643599]
│ │
│ └── :yₜ : [0.9898224428399893, -2.8982906037536074]
│
├── 2
│ │
│ ├── :zₜ : [-0.6509108574990858, -1.1431379512283615]
│ │
│ └── :yₜ : [0.3359904091532965, -0.8623329421862657]
│
├── 3
│ │
│ ├── :zₜ : [-0.07516319780998115, -1.9946561099455804]
│ │
│ └── :yₜ : [-0.9674520819950085, -2.5506208955840632]
│
└── 4
│
├── :zₜ : [0.27923589356757517, -1.8163051932449021]
│
└── :yₜ : [1.1333993244926113, -0.8451504948226879]
We can get any specific value by indexing into the trace with the proper address:
trace_1[:steps => 2 => :zₜ]
2-element Vector{Float64}:
-0.6509108574990858
-1.1431379512283615
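These hierarchical addresses are also how observed data is supplied to Gen’s inference machinery. As a hedged sketch using the standard Gen API (the constraint value here is made up for illustration), we can pin down the observed choices with `Gen.choicemap` and then sample a trace consistent with them using `Gen.generate`:

```julia
# Constrain the first timestep's observation to a hypothetical value.
constraints = Gen.choicemap()
constraints[:steps => 1 => :yₜ] = [0.5, -1.0]

# `generate` samples the remaining (unconstrained) choices — here, :zₜ —
# and returns the trace along with an importance weight.
(constrained_trace, weight) = Gen.generate(model, (2, 1), constraints)
```

This constrain-and-generate pattern is the basic building block we will use when running SMC-style inference on the observation stream later in the notebook.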
We’ll also define the following helper function:
function get_x(trace, t)
if t == 0
D = get_args(trace)[1]
return zeros(D)
else
return trace[:steps => t => :zₜ]
end
end
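For example (the values for \(t > 0\) depend on the particular random trace):

```julia
get_x(trace_1, 0)  # z₀: always the 2-element zero vector [0.0, 0.0]
get_x(trace_1, 2)  # the sampled latent position zₜ at timestep t = 2
```

The `t == 0` branch exists because \(z_0\) is fixed to \(\vec{0}\) by the model rather than stored as a random choice in the trace.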
Now, let’s generate a trace with 50 timesteps.
trace_50 = Gen.simulate(model, (2, 50));
observations = [trace_50[:steps => t => :yₜ] for t=1:50]
observations
50-element Vector{Vector{Float64}}:
[1.075781171993675, 0.5225368039979145]
[0.30648408399112825, 2.6919531435646293]
[3.589285043378361, 4.260725247584654]
[2.675026674859555, 4.967662431016679]
[1.6583667978660914, 4.296741160531252]
[1.1718302690537648, 5.289505776726546]
[4.505891059931557, 7.168509410275103]
[4.7990351702707414, 9.124818585741401]
[2.021747269236077, 10.049824043142813]
[-0.8067310229026994, 8.983241331957604]
[0.3177925348836881, 10.441901258947876]
[-2.3241175562292202, 12.122446151183185]
[-2.5155520413920485, 12.788072839912545]
⋮
[-7.692543268353392, 26.142249424902793]
[-8.80023713015943, 26.140885325106478]
[-9.88698620514221, 30.124997156565595]
[-12.148610248957382, 30.943176843558824]
[-10.947012446979949, 28.999770034910725]
[-9.693094762078506, 27.27337720641297]
[-9.099415725898062, 30.22160659552356]
[-6.819114115075519, 30.35073725430037]
[-10.267834547688844, 30.080865234026678]
[-11.258965660009597, 27.675840947941765]
[-8.304412583048862, 27.672659838267695]
[-6.45955901483878, 26.25244062975913]
And let’s visualize what the latent trajectory and observations from this trace looked like.
# Include the visualization code
includet("../utils/visualize/visualize.jl")

# Global variable: visualize traces on an axis with the X and Y axes bounded between -15 and 25.
VIZ_DOMAIN() = ((-15, 25), (-15, 25))
(figure_1, animated_time_1) = plot_trajectory_and_observations(
VIZ_DOMAIN(),
t -> observations[t],
t -> trace_50[:steps => t => :zₜ]
);
figure_1